{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "multi-class_classification_of_handwritten_digits.ipynb",
      "version": "0.3.2",
      "views": {},
      "default_view": {},
      "provenance": [],
      "collapsed_sections": [
        "266KQvZoMxMv",
        "6sfw3LH0Oycm",
        "copyright-notice"
      ]
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "copyright-notice",
        "colab_type": "text"
      },
      "source": [
        "#### Copyright 2017 Google LLC."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "copyright-notice2",
        "colab_type": "code",
        "cellView": "both",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "outputs": [],
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ],
      "execution_count": 0
    },
    {
      "metadata": {
        "id": "mPa95uXvcpcn",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " # Clasificaci\u00f3n de d\u00edgitos escritos a mano mediante redes neuronales"
      ]
    },
    {
      "metadata": {
        "id": "Fdpn8b90u8Tp",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ![img](https://www.tensorflow.org/versions/r0.11/images/MNIST.png)"
      ]
    },
    {
      "metadata": {
        "id": "c7HLCm66Cs2p",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " **Objetivos de aprendizaje:**\n",
        "  * entrenar un modelo lineal y una red neuronal para clasificar d\u00edgitos escritos a mano del conjunto de datos de [MNIST](http://yann.lecun.com/exdb/mnist/) cl\u00e1sico\n",
        "  * comparar el rendimiento de los modelos de clasificaci\u00f3n lineal y de redes neuronales\n",
        "  * visualizar las ponderaciones de una capa oculta de una red neural"
      ]
    },
    {
      "metadata": {
        "id": "HSEh-gNdu8T0",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Nuestro objetivo es asignar cada imagen de entrada al d\u00edgito num\u00e9rico correcto. Crearemos una red neural con algunas capas ocultas y una capa de softmax en la parte superior para seleccionar la clase ganadora."
      ]
    },
    {
      "metadata": {
        "id": "2NMdE1b-7UIH",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Preparaci\u00f3n\n",
        "\n",
        "Primero, descarguemos el conjunto de datos, importemos TensorFlow y otras utilidades, y carguemos los datos en un `DataFrame` de *Pandas*. Ten en cuenta que estos datos son una muestra de los datos de entrenamiento de MNIST originales; tomamos 20,000 filas al azar."
      ]
    },
    {
      "metadata": {
        "id": "4LJ4SD8BWHeh",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "test": {
            "output": "ignore",
            "timeout": 600
          }
        },
        "cellView": "both"
      },
      "source": [
        "from __future__ import print_function\n",
        "\n",
        "import glob\n",
        "import math\n",
        "import os\n",
        "\n",
        "from IPython import display\n",
        "from matplotlib import cm\n",
        "from matplotlib import gridspec\n",
        "from matplotlib import pyplot as plt\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "import seaborn as sns\n",
        "from sklearn import metrics\n",
        "import tensorflow as tf\n",
        "from tensorflow.python.data import Dataset\n",
        "\n",
        "tf.logging.set_verbosity(tf.logging.ERROR)\n",
        "pd.options.display.max_rows = 10\n",
        "pd.options.display.float_format = '{:.1f}'.format\n",
        "\n",
        "mnist_dataframe = pd.read_csv(\n",
        "  \"https://download.mlcc.google.com/mledu-datasets/mnist_train_small.csv\",\n",
        "  sep=\",\",\n",
        "  header=None)\n",
        "\n",
        "# Use just the first 10,000 records for training/validation.\n",
        "mnist_dataframe = mnist_dataframe.head(10000)\n",
        "\n",
        "mnist_dataframe = mnist_dataframe.reindex(np.random.permutation(mnist_dataframe.index))\n",
        "mnist_dataframe.head()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "kg0-25p2mOi0",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " La primera columna contiene la etiqueta de clase. Las columnas restantes contienen los valores de los atributos, uno por p\u00edxel para los valores de p\u00edxel de `28\u00d728=784`. La mayor\u00eda de estos valores de p\u00edxel de 784 son cero; es posible que quieras dedicar un minuto a confirmar que no sean *todos* cero."
      ]
    },
    {
      "metadata": {
        "id": "PQ7vuOwRCsZ1",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ![img](https://www.tensorflow.org/versions/r0.11/images/MNIST-Matrix.png)"
      ]
    },
    {
      "metadata": {
        "id": "dghlqJPIu8UM",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Estos ejemplos son im\u00e1genes de n\u00fameros escritos a mano con una resoluci\u00f3n relativamente baja y alto contraste. Se representa cada uno de los diez d\u00edgitos de `0-9`, con una etiqueta de clase \u00fanica para cada d\u00edgito posible. Por lo tanto, este es un problema de clasificaci\u00f3n de clase m\u00faltiple con 10 clases.\n",
        "\n",
        "Ahora, analicemos las etiquetas y atributos, y observemos algunos ejemplos. Observa el uso de `loc`, que nos permite extraer columnas en funci\u00f3n de la ubicaci\u00f3n original, ya que no tenemos una fila de encabezado en este conjunto de datos."
      ]
    },
    {
      "metadata": {
        "id": "JfFWWvMWDFrR",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "test": {
            "output": "ignore",
            "timeout": 600
          }
        }
      },
      "source": [
        "def parse_labels_and_features(dataset):\n",
        "  \"\"\"Extracts labels and features.\n",
        "  \n",
        "  This is a good place to scale or transform the features if needed.\n",
        "  \n",
        "  Args:\n",
        "    dataset: A Pandas `Dataframe`, containing the label on the first column and\n",
        "      monochrome pixel values on the remaining columns, in row major order.\n",
        "  Returns:\n",
        "    A `tuple` `(labels, features)`:\n",
        "      labels: A Pandas `Series`.\n",
        "      features: A Pandas `DataFrame`.\n",
        "  \"\"\"\n",
        "  labels = dataset[0]\n",
        "\n",
        "  # DataFrame.loc index ranges are inclusive at both ends.\n",
        "  features = dataset.loc[:,1:784]\n",
        "  # Scale the data to [0, 1] by dividing out the max value, 255.\n",
        "  features = features / 255\n",
        "\n",
        "  return labels, features"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "mFY_-7vZu8UU",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "training_targets, training_examples = parse_labels_and_features(mnist_dataframe[:7500])\n",
        "training_examples.describe()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "4-Vgg-1zu8Ud",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "validation_targets, validation_examples = parse_labels_and_features(mnist_dataframe[7500:10000])\n",
        "validation_examples.describe()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "wrnAI1v6u8Uh",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Muestra un ejemplo al azar y su etiqueta correspondiente."
      ]
    },
    {
      "metadata": {
        "id": "s-euVJVtu8Ui",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "rand_example = np.random.choice(training_examples.index)\n",
        "_, ax = plt.subplots()\n",
        "ax.matshow(training_examples.loc[rand_example].values.reshape(28, 28))\n",
        "ax.set_title(\"Label: %i\" % training_targets.loc[rand_example])\n",
        "ax.grid(False)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "ScmYX7xdZMXE",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Tarea\u00a01: Crear un modelo lineal para MNIST\n",
        "\n",
        "Primero, creemos un modelo de punto de referencia que sirva como comparaci\u00f3n. El `LinearClassifier` proporciona un conjunto de clasificadores *k* de uno frente a todos, uno para cada clase *k*.\n",
        "\n",
        "Observar\u00e1s que, adem\u00e1s de informar la exactitud y representar la p\u00e9rdida log\u00edstica en el tiempo, tambi\u00e9n mostramos una [**matriz de confusi\u00f3n**](https://es.wikipedia.org/wiki/Matriz_de_confusi%C3%B3n). La matriz de confusi\u00f3n muestra qu\u00e9 clases se clasificaron de forma incorrecta como otras clases. \u00bfQu\u00e9 d\u00edgitos se confundieron con otros?\n",
        "\n",
        "Observa tambi\u00e9n que el error del modelo se rastrea a trav\u00e9s de la funci\u00f3n de `log_loss`. Esta no debe confundirse con la funci\u00f3n de p\u00e9rdida interna de `LinearClassifier` que se us\u00f3 para el entrenamiento."
      ]
    },
    {
      "metadata": {
        "id": "cpoVC4TSdw5Z",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def construct_feature_columns():\n",
        "  \"\"\"Construct the TensorFlow Feature Columns.\n",
        "\n",
        "  Returns:\n",
        "    A set of feature columns\n",
        "  \"\"\" \n",
        "  \n",
        "  # There are 784 pixels in each image. \n",
        "  return set([tf.feature_column.numeric_column('pixels', shape=784)])"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "kMmL89yGeTfz",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Aqu\u00ed, haremos funciones de entrada independientes para el entrenamiento y la predicci\u00f3n. Las anidaremos en `create_training_input_fn()` y `create_predict_input_fn()`, respectivamente, para poder invocarlas para devolver las funciones `_input_fn` correspondientes para pasar nuestras llamadas de `.train()` y `.predict()`."
      ]
    },
    {
      "metadata": {
        "id": "OeS47Bmn5Ms2",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def create_training_input_fn(features, labels, batch_size, num_epochs=None, shuffle=True):\n",
        "  \"\"\"A custom input_fn for sending MNIST data to the estimator for training.\n",
        "\n",
        "  Args:\n",
        "    features: The training features.\n",
        "    labels: The training labels.\n",
        "    batch_size: Batch size to use during training.\n",
        "\n",
        "  Returns:\n",
        "    A function that returns batches of training features and labels during\n",
        "    training.\n",
        "  \"\"\"\n",
        "  def _input_fn(num_epochs=None, shuffle=True):\n",
        "    # Input pipelines are reset with each call to .train(). To ensure model\n",
        "    # gets a good sampling of data, even when number of steps is small, we \n",
        "    # shuffle all the data before creating the Dataset object\n",
        "    idx = np.random.permutation(features.index)\n",
        "    raw_features = {\"pixels\":features.reindex(idx)}\n",
        "    raw_targets = np.array(labels[idx])\n",
        "   \n",
        "    ds = Dataset.from_tensor_slices((raw_features,raw_targets)) # warning: 2GB limit\n",
        "    ds = ds.batch(batch_size).repeat(num_epochs)\n",
        "    \n",
        "    if shuffle:\n",
        "      ds = ds.shuffle(10000)\n",
        "    \n",
        "    # Return the next batch of data.\n",
        "    feature_batch, label_batch = ds.make_one_shot_iterator().get_next()\n",
        "    return feature_batch, label_batch\n",
        "\n",
        "  return _input_fn"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "8zoGWAoohrwS",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def create_predict_input_fn(features, labels, batch_size):\n",
        "  \"\"\"A custom input_fn for sending mnist data to the estimator for predictions.\n",
        "\n",
        "  Args:\n",
        "    features: The features to base predictions on.\n",
        "    labels: The labels of the prediction examples.\n",
        "\n",
        "  Returns:\n",
        "    A function that returns features and labels for predictions.\n",
        "  \"\"\"\n",
        "  def _input_fn():\n",
        "    raw_features = {\"pixels\": features.values}\n",
        "    raw_targets = np.array(labels)\n",
        "    \n",
        "    ds = Dataset.from_tensor_slices((raw_features, raw_targets)) # warning: 2GB limit\n",
        "    ds = ds.batch(batch_size)\n",
        "    \n",
        "        \n",
        "    # Return the next batch of data.\n",
        "    feature_batch, label_batch = ds.make_one_shot_iterator().get_next()\n",
        "    return feature_batch, label_batch\n",
        "\n",
        "  return _input_fn"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "G6DjSLZMu8Um",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def train_linear_classification_model(\n",
        "    learning_rate,\n",
        "    steps,\n",
        "    batch_size,\n",
        "    training_examples,\n",
        "    training_targets,\n",
        "    validation_examples,\n",
        "    validation_targets):\n",
        "  \"\"\"Trains a linear classification model for the MNIST digits dataset.\n",
        "  \n",
        "  In addition to training, this function also prints training progress information,\n",
        "  a plot of the training and validation loss over time, and a confusion\n",
        "  matrix.\n",
        "  \n",
        "  Args:\n",
        "    learning_rate: A `float`, the learning rate to use.\n",
        "    steps: A non-zero `int`, the total number of training steps. A training step\n",
        "      consists of a forward and backward pass using a single batch.\n",
        "    batch_size: A non-zero `int`, the batch size.\n",
        "    training_examples: A `DataFrame` containing the training features.\n",
        "    training_targets: A `DataFrame` containing the training labels.\n",
        "    validation_examples: A `DataFrame` containing the validation features.\n",
        "    validation_targets: A `DataFrame` containing the validation labels.\n",
        "      \n",
        "  Returns:\n",
        "    The trained `LinearClassifier` object.\n",
        "  \"\"\"\n",
        "\n",
        "  periods = 10\n",
        "\n",
        "  steps_per_period = steps / periods  \n",
        "  # Create the input functions.\n",
        "  predict_training_input_fn = create_predict_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  predict_validation_input_fn = create_predict_input_fn(\n",
        "    validation_examples, validation_targets, batch_size)\n",
        "  training_input_fn = create_training_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  \n",
        "  # Create a LinearClassifier object.\n",
        "  my_optimizer = tf.train.AdagradOptimizer(learning_rate=learning_rate)\n",
        "  my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n",
        "  classifier = tf.estimator.LinearClassifier(\n",
        "      feature_columns=construct_feature_columns(),\n",
        "      n_classes=10,\n",
        "      optimizer=my_optimizer,\n",
        "      config=tf.estimator.RunConfig(keep_checkpoint_max=1)\n",
        "  )\n",
        "\n",
        "  # Train the model, but do so inside a loop so that we can periodically assess\n",
        "  # loss metrics.\n",
        "  print(\"Training model...\")\n",
        "  print(\"LogLoss error (on validation data):\")\n",
        "  training_errors = []\n",
        "  validation_errors = []\n",
        "  for period in range (0, periods):\n",
        "    # Train the model, starting from the prior state.\n",
        "    classifier.train(\n",
        "        input_fn=training_input_fn,\n",
        "        steps=steps_per_period\n",
        "    )\n",
        "  \n",
        "    # Take a break and compute probabilities.\n",
        "    training_predictions = list(classifier.predict(input_fn=predict_training_input_fn))\n",
        "    training_probabilities = np.array([item['probabilities'] for item in training_predictions])\n",
        "    training_pred_class_id = np.array([item['class_ids'][0] for item in training_predictions])\n",
        "    training_pred_one_hot = tf.keras.utils.to_categorical(training_pred_class_id,10)\n",
        "        \n",
        "    validation_predictions = list(classifier.predict(input_fn=predict_validation_input_fn))\n",
        "    validation_probabilities = np.array([item['probabilities'] for item in validation_predictions])    \n",
        "    validation_pred_class_id = np.array([item['class_ids'][0] for item in validation_predictions])\n",
        "    validation_pred_one_hot = tf.keras.utils.to_categorical(validation_pred_class_id,10)    \n",
        "    \n",
        "    # Compute training and validation errors.\n",
        "    training_log_loss = metrics.log_loss(training_targets, training_pred_one_hot)\n",
        "    validation_log_loss = metrics.log_loss(validation_targets, validation_pred_one_hot)\n",
        "    # Occasionally print the current loss.\n",
        "    print(\"  period %02d : %0.2f\" % (period, validation_log_loss))\n",
        "    # Add the loss metrics from this period to our list.\n",
        "    training_errors.append(training_log_loss)\n",
        "    validation_errors.append(validation_log_loss)\n",
        "  print(\"Model training finished.\")\n",
        "  # Remove event files to save disk space.\n",
        "  _ = map(os.remove, glob.glob(os.path.join(classifier.model_dir, 'events.out.tfevents*')))\n",
        "  \n",
        "  # Calculate final predictions (not probabilities, as above).\n",
        "  final_predictions = classifier.predict(input_fn=predict_validation_input_fn)\n",
        "  final_predictions = np.array([item['class_ids'][0] for item in final_predictions])\n",
        "  \n",
        "  \n",
        "  accuracy = metrics.accuracy_score(validation_targets, final_predictions)\n",
        "  print(\"Final accuracy (on validation data): %0.2f\" % accuracy)\n",
        "\n",
        "  # Output a graph of loss metrics over periods.\n",
        "  plt.ylabel(\"LogLoss\")\n",
        "  plt.xlabel(\"Periods\")\n",
        "  plt.title(\"LogLoss vs. Periods\")\n",
        "  plt.plot(training_errors, label=\"training\")\n",
        "  plt.plot(validation_errors, label=\"validation\")\n",
        "  plt.legend()\n",
        "  plt.show()\n",
        "  \n",
        "  # Output a plot of the confusion matrix.\n",
        "  cm = metrics.confusion_matrix(validation_targets, final_predictions)\n",
        "  # Normalize the confusion matrix by row (i.e by the number of samples\n",
        "  # in each class).\n",
        "  cm_normalized = cm.astype(\"float\") / cm.sum(axis=1)[:, np.newaxis]\n",
        "  ax = sns.heatmap(cm_normalized, cmap=\"bone_r\")\n",
        "  ax.set_aspect(1)\n",
        "  plt.title(\"Confusion matrix\")\n",
        "  plt.ylabel(\"True label\")\n",
        "  plt.xlabel(\"Predicted label\")\n",
        "  plt.show()\n",
        "\n",
        "  return classifier"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "ItHIUyv2u8Ur",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " **Dedica 5\u00a0minutos a observar qu\u00e9 nivel de eficacia tienes con respecto a la exactitud en un modelo lineal de esta forma. Para este ejercicio, lim\u00edtate a experimentar con los hiperpar\u00e1metros para el tama\u00f1o del lote, la tasa de aprendizaje y los pasos.**\n",
        "\n",
        "Detente si obtienes alg\u00fan resultado de una exactitud por encima de 0.9 aproximadamente."
      ]
    },
    {
      "metadata": {
        "id": "yaiIhIQqu8Uv",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "classifier = train_linear_classification_model(\n",
        "             learning_rate=0.02,\n",
        "             steps=100,\n",
        "             batch_size=10,\n",
        "             training_examples=training_examples,\n",
        "             training_targets=training_targets,\n",
        "             validation_examples=validation_examples,\n",
        "             validation_targets=validation_targets)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "266KQvZoMxMv",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### Soluci\u00f3n\n",
        "\n",
        "Haz clic m\u00e1s abajo para conocer una soluci\u00f3n posible."
      ]
    },
    {
      "metadata": {
        "id": "lRWcn24DM3qa",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Aqu\u00ed se incluye un conjunto de par\u00e1metros que deber\u00eda dar una exactitud de aproximadamente 0.9."
      ]
    },
    {
      "metadata": {
        "id": "TGlBMrUoM1K_",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "_ = train_linear_classification_model(\n",
        "    learning_rate=0.03,\n",
        "    steps=1000,\n",
        "    batch_size=30,\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "mk095OfpPdOx",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Tarea\u00a02: Reemplazar el clasificador lineal por una red neural\n",
        "\n",
        "**Reemplaza el LinearClassifier anterior por un [`DNNClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) y busca una combinaci\u00f3n de par\u00e1metros que d\u00e9 una exactitud de 0.95 o mejor.**\n",
        "\n",
        "Es posible que quieras experimentar con otros m\u00e9todos de regularizaci\u00f3n, como de retirados. Estos m\u00e9todos de regularizaci\u00f3n adicionales est\u00e1n documentados en los comentarios de la clase `DNNClassifier`."
      ]
    },
    {
      "metadata": {
        "id": "rm8P_Ttwu8U4",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "#\n",
        "# YOUR CODE HERE: Replace the linear classifier with a neural network.\n",
        "#"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "TOfmiSvqu8U9",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Una vez que tengas un buen modelo, comprueba no haber sobreajustado el conjunto de validaci\u00f3n al evaluar los datos de prueba que cargaremos m\u00e1s abajo.\n",
        ""
      ]
    },
    {
      "metadata": {
        "id": "evlB5ubzu8VJ",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "mnist_test_dataframe = pd.read_csv(\n",
        "  \"https://download.mlcc.google.com/mledu-datasets/mnist_test.csv\",\n",
        "  sep=\",\",\n",
        "  header=None)\n",
        "\n",
        "test_targets, test_examples = parse_labels_and_features(mnist_test_dataframe)\n",
        "test_examples.describe()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "PDuLd2Hcu8VL",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "#\n",
        "# YOUR CODE HERE: Calculate accuracy on the test set.\n",
        "#"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "6sfw3LH0Oycm",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### Soluci\u00f3n\n",
        "\n",
        "Haz clic m\u00e1s abajo para conocer una soluci\u00f3n posible."
      ]
    },
    {
      "metadata": {
        "id": "XatDGFKEO374",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " El c\u00f3digo que aparece m\u00e1s abajo es casi id\u00e9ntico al c\u00f3digo de entrenamiento de `LinearClassifer` original, a excepci\u00f3n de la configuraci\u00f3n espec\u00edfica de la red neural, como el hiperpar\u00e1metro para unidades ocultas."
      ]
    },
    {
      "metadata": {
        "id": "kdNTx8jkPQUx",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def train_nn_classification_model(\n",
        "    learning_rate,\n",
        "    steps,\n",
        "    batch_size,\n",
        "    hidden_units,\n",
        "    training_examples,\n",
        "    training_targets,\n",
        "    validation_examples,\n",
        "    validation_targets):\n",
        "  \"\"\"Trains a neural network classification model for the MNIST digits dataset.\n",
        "  \n",
        "  In addition to training, this function also prints training progress information,\n",
        "  a plot of the training and validation loss over time, as well as a confusion\n",
        "  matrix.\n",
        "  \n",
        "  Args:\n",
        "    learning_rate: A `float`, the learning rate to use.\n",
        "    steps: A non-zero `int`, the total number of training steps. A training step\n",
        "      consists of a forward and backward pass using a single batch.\n",
        "    batch_size: A non-zero `int`, the batch size.\n",
        "    hidden_units: A `list` of int values, specifying the number of neurons in each layer.\n",
        "    training_examples: A `DataFrame` containing the training features.\n",
        "    training_targets: A `DataFrame` containing the training labels.\n",
        "    validation_examples: A `DataFrame` containing the validation features.\n",
        "    validation_targets: A `DataFrame` containing the validation labels.\n",
        "      \n",
        "  Returns:\n",
        "    The trained `DNNClassifier` object.\n",
        "  \"\"\"\n",
        "\n",
        "  periods = 10\n",
        "  # Caution: input pipelines are reset with each call to train. \n",
        "  # If the number of steps is small, your model may never see most of the data.  \n",
        "  # So with multiple `.train` calls like this you may want to control the length \n",
        "  # of training with num_epochs passed to the input_fn. Or, you can do a really-big shuffle, \n",
        "  # or since it's in-memory data, shuffle all the data in the `input_fn`.\n",
        "  steps_per_period = steps / periods  \n",
        "  # Create the input functions.\n",
        "  predict_training_input_fn = create_predict_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  predict_validation_input_fn = create_predict_input_fn(\n",
        "    validation_examples, validation_targets, batch_size)\n",
        "  training_input_fn = create_training_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  \n",
        "  # Create the input functions.\n",
        "  predict_training_input_fn = create_predict_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  predict_validation_input_fn = create_predict_input_fn(\n",
        "    validation_examples, validation_targets, batch_size)\n",
        "  training_input_fn = create_training_input_fn(\n",
        "    training_examples, training_targets, batch_size)\n",
        "  \n",
        "  # Create feature columns.\n",
        "  feature_columns = [tf.feature_column.numeric_column('pixels', shape=784)]\n",
        "\n",
        "  # Create a DNNClassifier object.\n",
        "  my_optimizer = tf.train.AdagradOptimizer(learning_rate=learning_rate)\n",
        "  my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n",
        "  classifier = tf.estimator.DNNClassifier(\n",
        "      feature_columns=feature_columns,\n",
        "      n_classes=10,\n",
        "      hidden_units=hidden_units,\n",
        "      optimizer=my_optimizer,\n",
        "      config=tf.contrib.learn.RunConfig(keep_checkpoint_max=1)\n",
        "  )\n",
        "\n",
        "  # Train the model, but do so inside a loop so that we can periodically assess\n",
        "  # loss metrics.\n",
        "  print(\"Training model...\")\n",
        "  print(\"LogLoss error (on validation data):\")\n",
        "  training_errors = []\n",
        "  validation_errors = []\n",
        "  for period in range (0, periods):\n",
        "    # Train the model, starting from the prior state.\n",
        "    classifier.train(\n",
        "        input_fn=training_input_fn,\n",
        "        steps=steps_per_period\n",
        "    )\n",
        "  \n",
        "    # Take a break and compute probabilities.\n",
        "    training_predictions = list(classifier.predict(input_fn=predict_training_input_fn))\n",
        "    training_probabilities = np.array([item['probabilities'] for item in training_predictions])\n",
        "    training_pred_class_id = np.array([item['class_ids'][0] for item in training_predictions])\n",
        "    training_pred_one_hot = tf.keras.utils.to_categorical(training_pred_class_id,10)\n",
        "        \n",
        "    validation_predictions = list(classifier.predict(input_fn=predict_validation_input_fn))\n",
        "    validation_probabilities = np.array([item['probabilities'] for item in validation_predictions])    \n",
        "    validation_pred_class_id = np.array([item['class_ids'][0] for item in validation_predictions])\n",
        "    validation_pred_one_hot = tf.keras.utils.to_categorical(validation_pred_class_id,10)    \n",
        "    \n",
        "    # Compute training and validation errors.\n",
        "    training_log_loss = metrics.log_loss(training_targets, training_pred_one_hot)\n",
        "    validation_log_loss = metrics.log_loss(validation_targets, validation_pred_one_hot)\n",
        "    # Occasionally print the current loss.\n",
        "    print(\"  period %02d : %0.2f\" % (period, validation_log_loss))\n",
        "    # Add the loss metrics from this period to our list.\n",
        "    training_errors.append(training_log_loss)\n",
        "    validation_errors.append(validation_log_loss)\n",
        "  print(\"Model training finished.\")\n",
        "  # Remove event files to save disk space.\n",
        "  _ = map(os.remove, glob.glob(os.path.join(classifier.model_dir, 'events.out.tfevents*')))\n",
        "  \n",
        "  # Calculate final predictions (not probabilities, as above).\n",
        "  final_predictions = classifier.predict(input_fn=predict_validation_input_fn)\n",
        "  final_predictions = np.array([item['class_ids'][0] for item in final_predictions])\n",
        "  \n",
        "  \n",
        "  accuracy = metrics.accuracy_score(validation_targets, final_predictions)\n",
        "  print(\"Final accuracy (on validation data): %0.2f\" % accuracy)\n",
        "\n",
        "  # Output a graph of loss metrics over periods.\n",
        "  plt.ylabel(\"LogLoss\")\n",
        "  plt.xlabel(\"Periods\")\n",
        "  plt.title(\"LogLoss vs. Periods\")\n",
        "  plt.plot(training_errors, label=\"training\")\n",
        "  plt.plot(validation_errors, label=\"validation\")\n",
        "  plt.legend()\n",
        "  plt.show()\n",
        "  \n",
        "  # Output a plot of the confusion matrix.\n",
        "  cm = metrics.confusion_matrix(validation_targets, final_predictions)\n",
        "  # Normalize the confusion matrix by row (i.e by the number of samples\n",
        "  # in each class).\n",
        "  cm_normalized = cm.astype(\"float\") / cm.sum(axis=1)[:, np.newaxis]\n",
        "  ax = sns.heatmap(cm_normalized, cmap=\"bone_r\")\n",
        "  ax.set_aspect(1)\n",
        "  plt.title(\"Confusion matrix\")\n",
        "  plt.ylabel(\"True label\")\n",
        "  plt.xlabel(\"Predicted label\")\n",
        "  plt.show()\n",
        "\n",
        "  return classifier"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "ZfzsTYGPPU8I",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "classifier = train_nn_classification_model(\n",
        "    learning_rate=0.05,\n",
        "    steps=1000,\n",
        "    batch_size=30,\n",
        "    hidden_units=[100, 100],\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "qXvrOgtUR-zD",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " A continuaci\u00f3n, verificaremos la exactitud del conjunto de prueba."
      ]
    },
    {
      "metadata": {
        "id": "scQNpDePSFjt",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "mnist_test_dataframe = pd.read_csv(\n",
        "  \"https://download.mlcc.google.com/mledu-datasets/mnist_test.csv\",\n",
        "  sep=\",\",\n",
        "  header=None)\n",
        "\n",
        "test_targets, test_examples = parse_labels_and_features(mnist_test_dataframe)\n",
        "test_examples.describe()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "EVaWpWKvSHmu",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "predict_test_input_fn = create_predict_input_fn(\n",
        "    test_examples, test_targets, batch_size=100)\n",
        "\n",
        "test_predictions = classifier.predict(input_fn=predict_test_input_fn)\n",
        "test_predictions = np.array([item['class_ids'][0] for item in test_predictions])\n",
        "  \n",
        "accuracy = metrics.accuracy_score(test_targets, test_predictions)\n",
        "print(\"Accuracy on test data: %0.2f\" % accuracy)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "WX2mQBAEcisO",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Tarea\u00a03: Visualizar las ponderaciones de la primera capa oculta\n",
        "\n",
        "Dediquemos unos minutos a examinar nuestra red neural y observar qu\u00e9 aprendi\u00f3 al acceder al atributo `weights_` de nuestro modelo.\n",
        "\n",
        "La capa de entrada de nuestro modelo tiene `784` ponderaciones correspondientes a las im\u00e1genes de entrada de `28\u00d728` p\u00edxeles. La primera capa oculta tendr\u00e1 `784\u00d7N` ponderaciones, donde `N` es el n\u00famero de nodos en esa capa. Podemos regresar esas ponderaciones a im\u00e1genes de `28\u00d728` al *cambiar la forma* de cada una de las matrices `N` de `1\u00d7784` de las ponderaciones a matrices `N` de un tama\u00f1o de `28\u00d728`.\n",
        "\n",
        "Ejecuta la siguiente celda para representar las ponderaciones. Ten en cuenta que esta celda requiere que ya se haya entrenado un `DNNClassifier` denominado \"clasificador\"."
      ]
    },
    {
      "metadata": {
        "id": "eUC0Z8nbafgG",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "test": {
            "output": "ignore",
            "timeout": 600
          }
        },
        "cellView": "both"
      },
      "source": [
        "print(classifier.get_variable_names())\n",
        "\n",
        "weights0 = classifier.get_variable_value(\"dnn/hiddenlayer_0/kernel\")\n",
        "\n",
        "print(\"weights0 shape:\", weights0.shape)\n",
        "\n",
        "num_nodes = weights0.shape[1]\n",
        "num_rows = int(math.ceil(num_nodes / 10.0))\n",
        "fig, axes = plt.subplots(num_rows, 10, figsize=(20, 2 * num_rows))\n",
        "for coef, ax in zip(weights0.T, axes.ravel()):\n",
        "    # Weights in coef is reshaped from 1x784 to 28x28.\n",
        "    ax.matshow(coef.reshape(28, 28), cmap=plt.cm.pink)\n",
        "    ax.set_xticks(())\n",
        "    ax.set_yticks(())\n",
        "\n",
        "plt.show()"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "kL8MEhNgrx9N",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " La primera capa oculta de la red neural debe modelar atributos de un nivel bastante bajo, de manera que la visualizaci\u00f3n de las ponderaciones probablemente mostrar\u00e1 algunas manchas borrosas o, posiblemente, partes de d\u00edgitos. Es posible que tambi\u00e9n veas algunas neuronas que b\u00e1sicamente sean inconsistencias; estas no tienen convergencia o se ignoran por parte de las capas m\u00e1s altas.\n",
        "\n",
        "Puede resultar interesante detener el entrenamiento en diferentes n\u00fameros de iteraciones y observar el efecto.\n",
        "\n",
        "**Entrena el clasificador para 10, 100 y 1,000 pasos, respectivamente. A continuaci\u00f3n, vuelve a ejecutar esta visualizaci\u00f3n.**\n",
        "\n",
        "\u00bfQu\u00e9 diferencias observas visualmente para los diferentes niveles de convergencia?"
      ]
    }
  ]
}
